Lattice Field Theory on Cluster Computers: Vector- Vs. Cache-Centric Programming
Abstract
We evaluate the possibility of moving medium-sized calculations in lattice field theory from vector supercomputers to cluster computers, specifically clusters built from Alpha processors and a Myrinet interconnect. We find that a medium-sized system with a performance of 10 to 20 GFlop/s can be built easily and cost-effectively from current off-the-shelf components. The memory-bandwidth behavior of the algorithms is analyzed both by experiment and with a cache simulator that uses C++ operator overloading. It seems that cluster systems, while hampered by poor memory bandwidth compared to supercomputers, might offer opportunities for algorithms that have good locality but are not vectorizable and thus do not perform well on vector systems.
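The abstract does not describe how the cache simulator works internally. As a rough illustration of the general idea only, the sketch below shows one way C++ operator overloading can feed array accesses into a cache model. Everything in it is an assumption made for illustration: the names `CacheSim`, `TrackedArray`, and `strided_sum` are hypothetical, the cache is a simple direct-mapped model, and addresses are logical byte offsets into the array, not real memory addresses.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical direct-mapped cache model (not the paper's simulator).
// It only counts hits and misses; no data is stored in the model.
class CacheSim {
public:
    CacheSim(std::size_t num_lines, std::size_t line_bytes)
        : line_bytes_(line_bytes), tags_(num_lines, 0) {
        assert(num_lines > 0 && line_bytes > 0);
    }

    // Record one access to the given logical byte offset.
    void access(std::size_t addr) {
        const std::size_t block = addr / line_bytes_;
        const std::size_t slot = block % tags_.size();
        // Tags are stored as block + 1 so that 0 can mean "empty line".
        if (tags_[slot] == block + 1) {
            ++hits_;
        } else {
            ++misses_;
            tags_[slot] = block + 1;
        }
    }

    std::size_t hits() const { return hits_; }
    std::size_t misses() const { return misses_; }

private:
    std::size_t line_bytes_;
    std::vector<std::size_t> tags_;
    std::size_t hits_ = 0;
    std::size_t misses_ = 0;
};

// Array wrapper whose overloaded operator[] reports every element
// access to the cache model before returning the element.
template <typename T>
class TrackedArray {
public:
    TrackedArray(std::size_t n, CacheSim& sim) : data_(n), sim_(&sim) {}

    T& operator[](std::size_t i) {
        sim_->access(i * sizeof(T));  // logical offset, an assumption
        return data_[i];
    }

private:
    std::vector<T> data_;
    CacheSim* sim_;
};

// Read every element exactly once, visiting indices in strided order.
double strided_sum(TrackedArray<double>& a, std::size_t n, std::size_t stride) {
    double sum = 0.0;
    for (std::size_t start = 0; start < stride; ++start) {
        for (std::size_t i = start; i < n; i += stride) {
            sum += a[i];
        }
    }
    return sum;
}
```

With a model of 64 lines of 64 bytes each and an array of 1024 doubles, calling `strided_sum` with stride 1 yields 128 misses and 896 hits, while stride 8 misses on all 1024 accesses. The numerical kernel itself is unchanged in both cases; only the element type of the container differs, which is the appeal of the operator-overloading approach.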
Similar resources
On the Single Processor Performance of Simple Lattice Boltzmann Kernels
This report presents a comprehensive survey of the effect of different data layouts on the single processor performance characteristics for the lattice Boltzmann method both for commodity “off-the-shelf” (COTS) architectures and tailored HPC systems, such as vector computers. We cover modern 64-bit processors ranging from IA32 compatible (Intel Xeon/Nocona, AMD Opteron), superscalar RISC (IBM P...
Cluster-based In-networking Caching for Content-Centric Networking
With the Internet architecture changing from host-centric communication model to content-centric model, Content Centric Networking (CCN) has emerged. One distinctive feature of CCN infrastructure is in-networking caching. As cache capacities of routers are relatively small compared with delivered data size, one challenge of in-networking caching is how to efficiently use the cache resources. In...
Fast Parallel I/O on Cluster Computers
Today’s cluster computers suffer from slow I/O, which slows down I/O-intensive applications. We show that fast disk I/O can be achieved by operating a parallel file system over fast networks such as Myrinet or Gigabit Ethernet. In this paper, we demonstrate how the ParaStation3 communication system helps speed-up the performance of parallel I/O on clusters using the open source parallel virtual...
Apple-CORE: Microgrids of SVP cores
To harness the potential of CMPs for scalable, energy-efficient performance in general-purpose computers, the Apple-CORE project has co-designed a general machine model and concurrency control interface with dedicated hardware support for concurrency management across multiple cores. Its SVP interface combines dataflow synchronisation with imperative programming, towards the efficient use of pa...
Parallel Spatial Pyramid Match Kernel Algorithm for Object Recognition using a Cluster of Computers
This paper parallelizes the spatial pyramid match kernel (SPK) implementation. SPK is one of the most usable kernel methods, along with support vector machine classifier, with high accuracy in object recognition. MATLAB parallel computing toolbox has been used to parallelize SPK. In this implementation, MATLAB Message Passing Interface (MPI) functions and features included in the toolbox help u...
Publication date: 1999